Homework 3 Answers

BSTA 512/612

Due: Friday February 14, 2025 at 11pm
Author

Your name here!!!

Modified

December 12, 2025

Answers are not necessarily complete! This is just meant to serve as a check if you are stuck.

Questions

Question 1

This question and data are adapted from this textbook.

In an experiment designed to describe the dose–response curve for vitamin K, individual rats were depleted of their vitamin K reserves and then fed dried liver for 4 days at different dosage levels. The response of each rat was measured as the concentration of a clotting agent needed to clot a sample of its blood in 3 minutes. The results of the experiment on 12 rats are given in the following table; values are expressed in common logarithms for both dose and response.

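library(readxl)  # read_excel()
library(here)    # here()
library(dplyr)   # %>%
library(gt)      # gt(), cols_label(), md()
clot = read_excel(here("data/CH05Q09.xls"))
clot %>% gt() %>%
  cols_label(RAT = md("**Rat**"),
             LOGCONC = md("**Log10 Concentration (Y)**"),
             LOGDOSE = md("**Log10 Dose (X)**"))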
Rat   Log10 Concentration (Y)   Log10 Dose (X)
  1                      2.65             0.18
  2                      2.25             0.33
  3                      2.26             0.42
  4                      1.95             0.54
  5                      1.72             0.65
  6                      1.60             0.75
  7                      1.55             0.83
  8                      1.32             0.92
  9                      1.13             1.01
 10                      1.07             1.04
 11                      0.95             1.09
 12                      0.88             1.15

Use the log-transformed values as given in the dataset.

Use the following scatterplot as the basis for your answers:

Part a

Here is the code for fitting the model:

clot_mod = lm(LOGCONC ~ LOGDOSE, data = clot)
summary(clot_mod)

Call:
lm(formula = LOGCONC ~ LOGDOSE, data = clot)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.097151 -0.026859 -0.003392  0.028279  0.095355 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  2.93620    0.04230   69.41 9.39e-15 ***
LOGDOSE     -1.78501    0.05267  -33.89 1.18e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.05589 on 10 degrees of freedom
Multiple R-squared:  0.9914,    Adjusted R-squared:  0.9905 
F-statistic:  1149 on 1 and 10 DF,  p-value: 1.182e-11
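If the Excel file is not at hand, the fit can be reproduced from the values in the Question 1 table. This is a minimal base R sketch; `clot_tbl` and `clot_fit` are names made up here for illustration and are not part of the original solution.

```r
# Values copied from the Question 1 table (already on the log10 scale)
clot_tbl <- data.frame(
  RAT     = 1:12,
  LOGCONC = c(2.65, 2.25, 2.26, 1.95, 1.72, 1.60,
              1.55, 1.32, 1.13, 1.07, 0.95, 0.88),
  LOGDOSE = c(0.18, 0.33, 0.42, 0.54, 0.65, 0.75,
              0.83, 0.92, 1.01, 1.04, 1.09, 1.15)
)

# Simple linear regression of log10 concentration on log10 dose
clot_fit <- lm(LOGCONC ~ LOGDOSE, data = clot_tbl)

# Estimated intercept and slope, for comparison with the summary output
coef(clot_fit)
```

The printed coefficients should match the Estimate column of the summary output after rounding.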

Part b

term        df       sumsq       meansq   statistic        p.value
LOGDOSE      1  3.58845400  3.588454000    1148.759   1.182233e-11
Residuals   10  0.03123767  0.003123767          NA             NA

Part c

\[ F = 1148.759 \]
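
As a check, the F statistic is the ratio of the two mean squares in the Part b ANOVA table (written MSR and MSE here, labels that do not appear in the output). In simple linear regression it also equals the square of the slope's t statistic; the small discrepancy below comes from using the rounded t value.

\[\begin{aligned} F &= \frac{\text{MSR}}{\text{MSE}} = \frac{3.588454}{0.003123767} \approx 1148.76 \\ t^2 &= (-33.89)^2 \approx 1148.5 \end{aligned}\]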

Part d

Not shown because the answer would be the complete solution.

Question 2

Part a

[1] -0.9956757

Part b

term        df       sumsq       meansq   statistic        p.value
LOGDOSE      1  3.58845400  3.588454000    1148.759   1.182233e-11
Residuals   10  0.03123767  0.003123767          NA             NA
[1] 0.9913701
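
The correlation in Part a and the R-squared value here are linked through the sums of squares in the ANOVA table. A short worked check, using SSR and SSY as shorthand for the regression and total sums of squares (labels that do not appear in the output):

\[\begin{aligned} \text{SSY} &= 3.58845400 + 0.03123767 = 3.61969167 \\ R^2 &= \frac{\text{SSR}}{\text{SSY}} = \frac{3.58845400}{3.61969167} \approx 0.99137 \\ r &= -\sqrt{R^2} \approx -0.99568 \end{aligned}\]

The negative root is taken because the estimated slope for LOGDOSE is negative.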

Part c

Not given

Question 3

if(!require(Sleuth3)) { install.packages("Sleuth3"); library(Sleuth3) }
q1_data = ex0824 

Part a

Part b

Not given

Part c

Fit the regression model, display the regression table, and write out the fitted regression line.

term           estimate    std.error   statistic        p.value
(Intercept)  47.0521633   0.50421678    93.31733   0.000000e+00
Age          -0.6957134   0.02937509   -23.68378   1.168799e-88

\[\begin{aligned} \widehat{\text{RR}} &= 47.05 - 0.70 \cdot \text{Age} \end{aligned}\]

Part d

[1] "Rate"       "Age"        ".fitted"    ".resid"     ".hat"      
[6] ".sigma"     ".cooksd"    ".std.resid"
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Part e

Part f

statistic   p.value        method
0.9782974   6.177141e-08   Shapiro-Wilk normality test

Part g

Part h

# A tibble: 0 × 8
# ℹ 8 variables: Rate <int>, Age <dbl>, .fitted <dbl>, .resid <dbl>,
#   .hat <dbl>, .sigma <dbl>, .cooksd <dbl>, .std.resid <dbl>

Part i

# A tibble: 618 × 8
    Rate   Age .fitted .resid    .hat .sigma .cooksd .std.resid
   <int> <dbl>   <dbl>  <dbl>   <dbl>  <dbl>   <dbl>      <dbl>
 1    78   1.9    45.7   32.3 0.00347   7.74  0.0296       4.12
 2    75   0.5    46.7   28.3 0.00395   7.76  0.0259       3.62
 3    73   3.1    44.9   28.1 0.00310   7.77  0.0201       3.59
 4    72   1.8    45.8   26.2 0.00350   7.78  0.0197       3.35
 5    70   2.5    45.3   24.7 0.00328   7.78  0.0164       3.15
 6    69   1.9    45.7   23.3 0.00347   7.79  0.0154       2.97
 7    69   4.5    43.9   25.1 0.00273   7.78  0.0140       3.20
 8    66   4.9    43.6   22.4 0.00263   7.80  0.0107       2.85
 9    27   1.5    46.0  -19.0 0.00360   7.81  0.0107      -2.43
10    48  25.3    29.5   18.5 0.00361   7.81  0.0102       2.37
# ℹ 608 more rows
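
Each row of this output can be checked against the fitted line from Part c. For the first row (Rate = 78, Age = 1.9), using the unrounded coefficients from the regression table:

\[\begin{aligned} \widehat{\text{RR}} &= 47.0521633 - 0.6957134 \cdot 1.9 \approx 45.73 \\ \text{residual} &= 78 - 45.73 = 32.27 \end{aligned}\]

These agree with the .fitted (45.7) and .resid (32.3) columns after rounding.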

Part j

`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Part k

Warning: The dot-dot notation (`..density..`) was deprecated in ggplot2 3.4.0.
ℹ Please use `after_stat(density)` instead.
ℹ The deprecated feature was likely used in the describedata package.
  Please report the issue to the authors.

Part l

Not given

Part m

Part n

For the log-transformed Rate:

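# q1_data2 is assumed to be q1_data with a log-transformed Rate column (Rate_log) added
q1_model2 = lm(Rate_log ~ Age, data = q1_data2)
q1_mod_t2 = tidy(q1_model2)  # tidy() comes from the broom package
q1_mod_t2 %>% gt()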
term             estimate      std.error   statistic         p.value
(Intercept)    3.84511854   0.0126276539   304.49984    0.000000e+00
Age           -0.01900896   0.0007356727   -25.83888   2.740318e-100
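
Because the outcome is on the log scale, the slope is easier to read after exponentiating. The sketch below assumes Rate_log is the natural log of Rate; the transformation itself is not shown here, but a natural log is consistent with \(e^{3.845} \approx 46.8\) being close to the untransformed intercept of 47.05.

\[\begin{aligned} \widehat{\log(\text{RR})} &= 3.845 - 0.019 \cdot \text{Age} \\ e^{-0.01900896} &\approx 0.9812 \end{aligned}\]

Under that assumption, each one-unit increase in Age multiplies the estimated rate by roughly 0.981, which is about 1.9% lower.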

Part o

For log-transformed Rate:

Part p

For log-transformed Rate:

Part q

Not given

Question 4

library(haven)  # read_sas()
dep_df = read_sas(here("data/completedata.sas7bdat"))

Part a

term           estimate   std.error   statistic   p.value   conf.low   conf.high
(Intercept)      6.4144      2.0501      3.1288    0.0018     2.3882     10.4406
Fatalism         0.1527      0.0452      3.3784    0.0008     0.0639      0.2414
Optimism        −0.3179      0.0722     −4.4058    0.0000    −0.4596     −0.1762
Spirituality     0.3587      0.1291      2.7781    0.0056     0.1051      0.6122

Another fun way to display:

tbl_regression(q2_mod_f1, intercept = TRUE)
Characteristic   Beta    95% CI¹        p-value
(Intercept)      6.4     2.4, 10        0.002
Fatalism         0.15    0.06, 0.24     <0.001
Optimism         -0.32   -0.46, -0.18   <0.001
Spirituality     0.36    0.11, 0.61     0.006
¹ CI = Confidence Interval
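
The confidence limits in the first table follow the usual pattern of estimate ± critical value × standard error. Below is a minimal base R sketch for the Fatalism row. It uses the normal critical value as a large-sample stand-in for the t critical value, which is an assumption because the residual degrees of freedom are not shown here. The variable names are illustrative only.

```r
# Rounded values copied from the Fatalism row of the first table
est <- 0.1527
se  <- 0.0452

# Normal approximation to the 95% critical value (stand-in for the t quantile)
crit <- qnorm(0.975)

# Lower and upper confidence limits
ci <- est + c(-1, 1) * crit * se
round(ci, 3)
```

The result should land close to the reported limits (0.0639, 0.2414). Small differences are expected from rounding and from using a normal rather than a t critical value.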

Part b

  • \(\beta_0\): The expected depression score is 6.4 when fatalism, optimism, and spirituality scores are all 0 (95% CI: 2.4, 10.4).

    • Same as homework 2: the intercept is not meaningful here, because a score of 0 is outside the range of possible scores for fatalism, optimism, and spirituality.
  • \(\beta_1\): For every 1-point higher fatalism score, the expected depression score is 0.15 points higher, adjusting for optimism and spirituality scores (95% CI: 0.06, 0.24).

Part c

Not given

Part d

\[\begin{aligned} \widehat{\text{Depression}} &= 5.39 + 0.15 \cdot \text{Fatalism} \end{aligned}\]